Abstract

Infectious disease ecology has recently raised its public profile beyond the scientific community due to the 
major threats that wildlife infections pose to biological conservation, animal welfare, human health and 
food security. As we start unravelling the full extent of emerging infectious diseases, there is an urgent need 
to facilitate multidisciplinary research in this area. Even though research in ecology has always had a strong 
theoretical component, cultural and technical hurdles often hamper direct collaboration between theoreti- 
cians and empiricists. Building upon our collective experience of multidisciplinary research and teaching in 
this area, we propose practical guidelines to help with effective integration among mathematical modelling, 
fieldwork and laboratory work. Modelling tools can be used at all steps of a field-based research pro- 
gramme, from the formulation of working hypotheses to field study design and data analysis. We illustrate 
our model-guided fieldwork framework with two case studies we have been conducting on wildlife infec- 
tious diseases: plague transmission in prairie dogs and lyssavirus dynamics in American and African bats. 
These demonstrate that mechanistic models, if properly integrated in research programmes, can provide a 
framework for holistic approaches to complex biological systems. 
INTRODUCTION

Concluding a report on the controversy surrounding Ross and 
Waite's pioneering mathematical models for malaria transmission, the 
British Medical Journal pointed out, in 1911, 'the paradox that quanti- 
tative work based on false postulates may, by stimulating biologists 
and field workers to a closer scrutiny of the facts, sometimes lead 
more certainly to the discovery of the truth, than non-quantitative 
investigations resting on impeccable foundations'. A hundred years 
later, and despite the ubiquity of mathematical models in all fields of 
life sciences, the statement retains remarkable pertinence. By describ- 
ing mathematically the unobserved mechanisms hypothesised to be 
causing biological phenomena (e.g. the boom-and-bust nature of 
epidemics or the cyclical fluctuations of animal populations) in a 
Newtonian fashion, Ross and other early modellers finally brought life 
sciences on a par with physical sciences. Mechanistic models have
come to form the backbone of modern teaching and research in eco- 
logy. The mainstream use of mathematical models in ecology, some- 
times referred to as strategic modelling (Gurney & Nisbet 1998), aims 
to formulate simple descriptions of universal drivers of population 
dynamics. Over the last century, this approach has provided useful 
insight into general principles of ecology and has helped ecologists 
generate testable hypotheses. However, it is not always clear to scien- 
tists working on a particular system what added value such generic 
models can bring. This stems from the conceptual, technical and, at 
times, cultural difficulties faced when trying to match models with 
empirical information, as experienced by anyone working at the inter- 
face of theoretical and empirical research. 

In principle, mathematical models can be embedded in 
broader frameworks for scientific investigation based on hypothesis 
generation and experimental falsification or validation. It is almost
50 years since Platt (1964) laid out the principles of strong inference
and advocated an iterative process of formulating multiple
alternative hypotheses, generating testable predictions, gathering 
experimental evidence and then revisiting the hypotheses in view of 
the evidence. Along the same lines, mechanistic models should 
evolve with the experimental evidence generated through the scien- 
tific process. Various textbooks, such as Hilborn & Mangel's (1997) 
Ecological Detective, have greatly contributed to promoting model 
integration in population ecology. This iterative process has been 
put into practice in ecology in various contexts, notably experimental
planktonic prey-predator systems, where the systematic use of
mathematical models has helped ascertain and quantify the respec- 
tive roles of diverse factors, including resource availability 
(Fussmann et al. 2000) and genetic composition (Becks et al. 2010), 
in the generation of complex trophic dynamics. Extending such 
approaches to natural populations poses many challenges, but there 
are examples of iterative implementation of observations, theory 
and experiments in field settings. Many of these have aimed to 
determine the drivers of population cycles, following a tradition set 
by Volterra (1926), and have encompassed a broad range of ecological
factors and interactions: from the iconic Canadian lynx-hare
predator-prey system (Krebs et al. 2001) to the Soay sheep population
of St Kilda island (Coulson et al. 2001), and from the British
red grouse and its parasites (Hudson et al. 1998) to the California 
red scale (a pest of citrus trees) and its parasitoid (Murdoch et al. 
2006). However, the common thread to all these examples is that 
only after long time series (several years if not decades) of data had 
been collected did theoretical questions appear, leading ultimately to 
a cycle of mathematical models and experimental validation. Ecolo- 
gists have generally failed to harness the power of mechanistic mod- 
els for study design and data integration during early phases of field 
studies, which can limit the power of data analysis and inference at 
later stages. 

What can we learn from retrospective studies to improve the flow 
of exchange between empirical and theoretical methods at the onset 
of a new programme of field-based research? This question is par- 
ticularly topical in infectious disease ecology. Indeed, beyond the 
academic motivation of integrating pathogens into ecological frame- 
works (Lafferty et al. 2008), wildlife infections have recently received 
a surge of attention in broader scientific and political communities 
because of the various threats they pose at the global level (Daszak 
et al. 2000): to the conservation of the species affected (Blaustein & 
Kiesecker 2002; Haydon et al. 2006; Frick et al. 2010); to ecosystem 
stability (Rizzo & Garbelotto 2003); to the viability and trade of 
livestock infected from wildlife reservoirs (Siembieda et al. 2011) 
and to public health in human populations affected by zoonotic dis- 
eases (Kuiken et al. 2005). From a scientific point of view, infec- 
tious diseases in wildlife are particularly challenging ecological 
systems because their dynamics are determined by processes operat- 
ing at multiple scales (Table 1), and because of practical difficulties 
with data collection in populations that are often difficult to 
observe or sample. Understanding these key processes requires 
input from many disciplines, using multiple methodologies and anal- 
ysing diverse datasets to triangulate the causal drivers of disease 
dynamics (Plowright et al. 2008). 

Table 1 Multiple scales at which the dynamics of wildlife infectious diseases can be modelled

Level                      Example                                                        References
Within host                Effect of maternal antibodies on demography                    Kallio et al. (2010)
Within groups of hosts     Social network governing infectious contacts between animals   Drewe (2010)
Between groups             Metapopulation dynamics                                        Haydon et al. (2006)
Across landscapes          Spatiotemporal waves of infection guided by natural barriers   Russell et al. (2005)
Between host species       Environmental reservoirs                                       Haydon et al. (2002)
                           Pathogen-mediated competition                                  Tompkins et al. (2003)
Between parasite species   Interactions within parasite communities                       Telfer et al. (2010)
From wildlife to humans    Risk factors for zoonotic emergence                            Jones et al. (2008)

In this study, we emphasise the benefits of integrating mathematical
and statistical models with empirical and experimental
approaches at all stages of a research project, with the first iteration
preceding data collection. This not only improves hypothesis generation
and study design but also increases the quantity and quality of
information gained from empirical studies. To this aim, we present 
a practical framework called model-guided fieldwork (MGF), which 
demands a rational dialogue between researchers from multiple dis- 
ciplines through a series of iterative steps, ultimately leading to 
improved causal inference and predictive power. While the vast 
majority of ecologists are aware of the usefulness of mathematical 
models, a lack of interdisciplinary expertise within research teams all 
too often prevents models from being used to their full potential. 
We aim to provide guidelines for 'good practice' in multidisciplinary 
ecological research, largely inspired by our own experience in 
wildlife disease ecology. 

In the MGF framework, biologists and modellers collaborate at 
all stages of the study, from initial model formulation and field 
study design, to data collection and analysis. The MGF approach 
recognises that there is often uncertainty in system structure and 
drivers, and utilises a priori mechanistic models to ensure field 
efforts can address this uncertainty. A key strength of the MGF 
approach is the planned iterative refinement of fieldwork, laboratory 
experiments and modelling throughout the project, ensuring empiri- 
cal studies are more focused and models are data driven and appro- 
priate to the specific system. Compared to after-the-fact modelling 
approaches, MGF helps to focus field studies on the most impor- 
tant structures and drivers of dynamics. Furthermore, the necessary 
continuous dialogue between collaborators throughout the project 
lifespan fosters a multidisciplinary, multidirectional flow of informa- 
tion (Fig. 1). It is important to underline that we envisage the 
framework shown in Fig. 1 as a strategic master plan that may 
unfold over several years, and which would be broken down into a 
number of smaller studies. However, it would be misleading to con- 
sider the large multidirectional framework as the mere juxtaposition 
of separate unidirectional studies. In the MGF programme, model- 
lers and biologists are involved in all steps, creating feedback loops 
that are missing from too many studies. 

Figure 1 Schematic diagram of our model-guided fieldwork framework, emphasising feedback loops between empiricists and modellers. Numbers refer to the five key
steps described in the text; however, it is possible to initiate the collaborative cycle at any stage. Dark grey boxes represent tasks led by modellers, light grey rounded
boxes tasks led by biologists and a gradient indicates shared responsibility (online version in colour: blue for modellers' tasks, green for biologists' tasks).

Below we describe the MGF framework in detail, emphasising
the practical contributions that modellers can make to the conception,
design, implementation and analysis of field studies. As illustrated
in Fig. 1, the five key stages presented here should not be
seen as a linear process, but as steps on a cycle of interactions that
can be initiated from any point and reiterated multiple times. We 
then present two case studies illustrating different components of 
the MGF framework. Finally, the discussion summarises the scien- 
tific benefits of the approach and casts light on the inner workings 
and the challenges faced when breaking barriers between traditional 
disciplines. The level of involvement required from all parties goes 
against the high level of specialisation prevalent in science; we 
provide some suggestions about planning and organisation in a 
multidisciplinary context, with insight from social sciences. 

DESCRIPTION OF THE MGF FRAMEWORK 

Ecological model generation: from the conceptual to the 
mathematical 

Mechanistic models are a formalisation of the hypothesised pro- 
cesses that drive the observed dynamics of a biological system. 
Some of these processes may be measured in real time in the field 
(e.g. births), others may be measured in controlled conditions (e.g. 
duration of infectious period), while others may not be observable 
(e.g. transmission of infection from individual to individual). The 
first step of MGF is to integrate the evidence-based, qualitative and 
quantitative descriptions of these processes into a formal mathemat- 
ical model that attempts to describe the dynamics of observed vari- 
ables — in disease ecology, typically numbers of individuals in 
different categories (e.g. age, sex, infectious or immunological sta- 
tus). Particularly when limited information is available about the 
nature of a process, one should consider multiple hypotheses in a
strong inference approach (Platt 1964); instead of trying to falsify a
single hypothesis, it is often more informative to formulate a com- 
prehensive set of biologically plausible, alternative hypotheses and 
assess their relative merits to explain available data. For example, 
the relative importance of multiple routes of transmission (Webb 
et al. 2006; Rohani et al. 2009) or multiple drivers of epidemic 
cycles (Wearing & Rohani 2006) can be assessed using mechanistic 
modelling approaches once data have been collected. Although 
more heuristic methods for hypothesis generation can be used, 
MGF forces researchers to be extremely specific in detailing their 
questions of interest and underlying assumptions. This specificity 
helps to ensure that the data collected will be appropriate for the 
analyses planned later (Fig. 1). 

Once a conceptual mechanistic model of the system has been 
proposed and formalised in diagrams and verbal description, theore- 
ticians translate it into a mechanistic mathematical model, checking 
the appropriateness of every assumption with the interdisciplinary 
team. Where existing modelling frameworks do not appear suitable, 
novel, tailor-made model structures can be designed, or a combina- 
tion of models may be used for different parts of the system. 
Designing model structures (i.e. equations describing the temporal 
or spatiotemporal changes in biological variables) will high- 
light essential parameters and appropriate methods of estimation; 
either direct measurement or indirect inference by fitting models 
to the data. Identification of the parameters in itself guides the 
experimental and field study design (Fig. 1). 

Modellers must ensure that the parameters are correctly inter- 
preted. For quantities that can be measured in the field, it is impor- 
tant to check that the mathematical parameters have the same 
dimension as the measurements. Conversely, when the model
includes parameters that cannot be measured directly, the team 
should try to find practical ways in which the estimates of those 
parameters, once fitted, can be validated indirectly. For example, in 
a model for epidemics, the transmission rate in itself is typically of 
little practical use, but it can be combined with other parameters to 
form the basic reproductive rate of the epidemic, which has an intu- 
itive interpretation and practical implications. Because the correct 
interpretation of a parameter depends on the way it is included 
mathematically in the equations, it is also important to discuss 
the choice of alternative functions describing key processes in the 
ecological system considered. Partially specified models (Wood 
2001) can also be considered when there is uncertainty in the choice 
of mathematical functions. 
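
As a worked illustration of how a transmission rate gains meaning once combined with other parameters, consider a deliberately simple susceptible-infectious model with density-dependent transmission. The symbols below are our own assumptions for this sketch and are not taken from the case studies: \beta is the transmission rate (per individual per unit time), \gamma the recovery rate and \mu the mortality rate of infectious hosts (both per unit time), and N the host population size in the absence of disease.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta S I, \\
\frac{dI}{dt} &= \beta S I - (\gamma + \mu) I, \\
R_0 &= \frac{\beta N}{\gamma + \mu}.
\end{aligned}
```

Checking dimensions, \beta N and \gamma + \mu are both rates (per unit time), so R_0 is dimensionless. In this sketch the infection can invade a fully susceptible population only if R_0 > 1, which is the kind of interpretable, indirectly testable quantity that a fitted value of \beta alone does not provide.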

In some cases, the choice of model structure may not be straight- 
forward. For example, most models for the population dynamics of 
infectious diseases assume that individuals will go through a small 
number of discrete states during the process of infection, typically: 
susceptible (uninfected), latent (infected but not infectious), infec- 
tious and recovered (often with lifelong protective immunity). Even 
if the studied infection follows this general progression, individuals 
may not be unambiguously assigned to a single status (McClintock 
et al. 2010); for example, it is often impossible to know from a mea- 
sured antibody titre whether the animal is currently infected or 
whether it is immune against future infection. Complementary 
experimental data are often needed to provide this kind of informa- 
tion, and further model refinement may be required to incorporate 
this biological richness (Charleston et al. 2011). The most relevant 
level of aggregation of individuals into discrete categories must be 
discussed carefully, taking into account the accuracy of the measure- 
ments available, the complexity of the model and the corresponding 
statistical methods. If not considered properly, these issues can lead 
to an ill-posed question, i.e. a situation where the results are of no 
practical use, as reviewed by Loehle (2011). 
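
To make the discrete-state structure described above concrete, the following is a minimal sketch of a susceptible-latent-infectious-recovered (SEIR) model. It assumes frequency-dependent transmission, a closed population without births or deaths, and a simple Euler integration scheme; the function name, argument names and parameter values are illustrative assumptions rather than part of any study cited here.

```python
def simulate_seir(s0, e0, i0, r0, beta, sigma, gamma, days, dt=0.1):
    """Euler integration of a closed-population SEIR model.

    beta  : transmission rate (per unit time), frequency-dependent (assumption)
    sigma : rate of progression from latent to infectious (1 / latent period)
    gamma : recovery rate (1 / infectious period)
    Returns a list of (time, S, E, I, R) tuples.
    """
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    n = s + e + i + r  # constant, because there are no births or deaths
    trajectory = [(0.0, s, e, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        infections = beta * s * i / n * dt   # S -> E
        onsets = sigma * e * dt              # E -> I
        recoveries = gamma * i * dt          # I -> R
        s -= infections
        e += infections - onsets
        i += onsets - recoveries
        r += recoveries
        trajectory.append((k * dt, s, e, i, r))
    return trajectory


# Hypothetical parameter values chosen only for illustration.
outbreak = simulate_seir(999, 0, 1, 0, beta=0.6, sigma=0.25, gamma=0.2, days=200)
fade_out = simulate_seir(999, 0, 1, 0, beta=0.1, sigma=0.25, gamma=0.2, days=200)
```

Each compartment corresponds to a category to which field or laboratory observations would need to be assigned, which is where the ambiguity of measures such as antibody titres becomes apparent.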

Model exploration 

Once a draft model is constructed, its dynamics must be explored 
over a wide range of parameter values and alternative assumptions 
using mathematical analysis and numerical simulations (e.g. Plo- 
wright et al. 2011). Patterns that the model can generate, and quali- 
tative features expected in the data (Duke-Sylvester et al. 2010) can 
be identified, such as the existence and stability of equilibrium 
states, steady or waning oscillations (Hampson et al. 2007) or extinc- 
tion of either the host or the infectious agent (Lloyd-Smith et al. 
2005). As described in the next section, this initial analysis will 
inform the design of the field study (e.g. how many observations or 
samples are required to detect hypothesised changes in disease 
dynamics?), and can highlight flaws in the model (e.g. if it fails to 
replicate known dynamics, such as oscillations, or predicts an unre- 
alistic prevalence of infection). This 'feasibility assessment' stage 
(Fig. 1) helps the scientific team identify aspects of the system that 
are poorly understood and need further empirical study or updates 
to the model structure. 
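
The kind of exploration described above can be sketched for a textbook SIR model with host births and deaths, written in proportions of the population. This is an assumption made for illustration, not one of the models used in the cited studies. For this model the endemic equilibrium and the Jacobian matrix evaluated at it are known in closed form, so the qualitative outcome can be classified across a range of parameter values without simulation. All names and values are hypothetical.

```python
def classify_sir_dynamics(beta, gamma, mu):
    """Classify long-term behaviour of ds/dt = mu - beta*s*i - mu*s,
    di/dt = beta*s*i - (gamma + mu)*i, with s and i as proportions."""
    r0 = beta / (gamma + mu)
    if r0 <= 1.0:
        return {"R0": r0, "outcome": "pathogen extinction"}
    s_star = 1.0 / r0                      # endemic proportion susceptible
    i_star = mu * (r0 - 1.0) / beta        # endemic proportion infectious
    # Trace and determinant of the Jacobian at the endemic equilibrium.
    trace = -mu * r0
    determinant = mu * (r0 - 1.0) * (gamma + mu)
    discriminant = trace ** 2 - 4.0 * determinant
    # Complex eigenvalues (negative discriminant) imply waning oscillations.
    outcome = "damped oscillations" if discriminant < 0 else "monotonic approach"
    return {"R0": r0, "s_star": s_star, "i_star": i_star, "outcome": outcome}


# Sweep over a hypothetical range of transmission rates.
for beta in (0.1, 0.3, 0.5, 1.0):
    print(beta, classify_sir_dynamics(beta, gamma=0.2, mu=0.001))
```

A sweep of this sort indicates which regions of parameter space are compatible with known features of the system (for example, recurrent outbreaks) before any data are collected.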

Sensitivity analysis is another essential process that helps focus 
data collection effort on the most important parameters, by deter- 
mining how changes in parameters affect model output (Blower & 
Dowlatabadi 1994; Cariboni et al. 2007). Highly sensitive parameters 
may require more measurement effort because the model output is
more dependent on these parameters. From a mathematical viewpoint,
sensitivity analysis should explore most of the parameter
space. The biological range of some parameters can be quite wide, 
especially when considering a variety of environmental conditions. 
In many cases, however, the most relevant biological information 
results from a more local sensitivity analysis in the parameter space, 
which can be loosely determined from previous knowledge. Local 
sensitivity analysis is useful because there is often interdependence 
in parameter values and sensitivities, and this informed process 
can help to reduce data collection effort by focusing on the most 
relevant region of the parameter space. Determining interactions 
between parameters can also help pre-empt issues with identifiability 
that may appear at a later stage, by suggesting simplifications in the 
model. A typical example would be a pair of parameters governing 
reproduction and mortality in perfectly symmetric ways, which 
could result in population dynamics affected by the ratio (or the 
difference) of the two parameters rather than their absolute values: 
the pair of parameters can then be replaced with a single aggregate 
parameter. 
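
A local sensitivity analysis can be sketched as follows, using normalised sensitivities (elasticities) estimated by central finite differences around a baseline parameter set. The model output chosen here, a basic reproduction number of the form beta / (gamma + mu), and the baseline values are assumptions made purely for illustration.

```python
def elasticity(model, params, name, rel_step=1e-4):
    """Proportional change in model output per proportional change in one
    parameter, estimated by a central finite difference at `params`."""
    base = model(**params)
    h = params[name] * rel_step
    up = dict(params)
    up[name] += h
    down = dict(params)
    down[name] -= h
    derivative = (model(**up) - model(**down)) / (2.0 * h)
    return derivative * params[name] / base


def basic_reproduction_number(beta, gamma, mu):
    # Illustrative output; any scalar summary of a model could be used instead.
    return beta / (gamma + mu)


baseline = {"beta": 0.5, "gamma": 0.2, "mu": 0.05}  # hypothetical values
elasticities = {
    name: elasticity(basic_reproduction_number, baseline, name)
    for name in baseline
}
```

In this example the output depends on gamma and mu only through their sum, so their elasticities are proportional to their baseline values. This is a simple instance of the situation described above in which two parameters could be replaced by a single aggregate parameter.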

Study design 

The design of ecological studies should aim to maximise the 
information that can be obtained from the data within the practical 
constraints imposed by the system. Although sample size calculations 
have become a standard practice in life sciences, mechanistic models 
are rarely used at this stage of empirical research. Once a priori 
models have been developed, as outlined in the previous section, 
they can help suggest how field data should be collected to optimise 
integration with other data sources. For example, De Jong & Bouma 
(2001) described a practical experimental framework for the measure- 
ment of vaccine-induced herd immunity in animal populations, based 
on a generic mathematical model for disease transmission. 
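
As a simple example of how a model prediction can feed into study design, suppose an a priori model predicts the prevalence of infection at the time of sampling. Under the assumptions of independent random sampling from a large population and a test with known sensitivity and perfect specificity, the number of animals needed to detect at least one positive with a given confidence follows directly. The function name and default values below are hypothetical.

```python
import math


def sample_size_to_detect(prevalence, confidence=0.95, test_sensitivity=1.0):
    """Smallest n such that P(at least one positive among n samples) >= confidence.

    Assumes independent sampling from a large population and perfect test
    specificity; `prevalence` would typically come from model exploration.
    """
    p_detect = prevalence * test_sensitivity  # chance a sampled animal tests positive
    if not 0.0 < p_detect < 1.0:
        raise ValueError("prevalence * test_sensitivity must lie strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    # Solve 1 - (1 - p_detect)**n >= confidence for n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_detect))


n_perfect_test = sample_size_to_detect(0.05)
n_imperfect_test = sample_size_to_detect(0.05, test_sensitivity=0.8)
```

Repeating the calculation across the range of prevalences generated by alternative models shows whether a feasible sampling effort could distinguish between them.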

Using mechanistic models ahead of field studies can improve the 
reliability of data collection. Craft et al. (2009) built a network 
model based on data describing the social interactions among Seren- 
geti lions, and used it to run simulations of disease outbreaks. They 
quantified the impact of several properties of the empirically derived 
network on the outcome of the simulations, and highlighted poten- 
tial biases caused by the way data had been collected. Although this 
analysis was conducted on a post hoc basis, such advice can help with 
the design of future field studies. 

Furthermore, modelling exercises can suggest additional measure- 
ments that had been overlooked. Rohani et al. (2009) demonstrated 
theoretically that environmental transmission could play a more sig- 
nificant role than previously acknowledged during outbreaks of 
avian influenza, which should encourage measurements for the pres- 
ence of influenza viruses in the environment. In another example, 
Plowright et al. (2011) developed a metapopulation model to simulate
the dynamics of Hendra virus within fruit bat populations, thus
providing a mechanistic explanation of increasing spillover from 
fruit bats into domestic horses in Australia. The inclusion of waning 
maternal immunity in the model improved the temporal match of 
simulated outbreaks to the observed ones. Although the presence 
of maternally derived antibody (MDA) has been reported, measure- 
ment of waning immunity itself in wild animals may not be feasible; 
instead experimental studies on captive bats could be carried out to 
test the hypothesis that MDA is protective and to derive empirical 
estimates of the rate of MDA decline. 

Model fitting

A crucial step of MGF comes when the model has been designed 
and the data have been collected: matching the two together with 
the help of statistical modelling. The first objective is usually to esti- 
mate the value of model parameters that were not known at the 
time of model construction. In the case where alternative hypothe- 
ses have been incorporated into different models, an additional 
objective is to assess which hypothesis provides the 'best fit' of the 
model to the data. What is meant by 'best fit' is an essential ques- 
tion that needs to be addressed by all the parties involved, biolo- 
gists, modellers and statisticians, even before the data have been 
collected. Although a number of methods for curve fitting, such as 
smoothing, least squares or non-linear forecasting, have been tradi- 
tionally employed for ecological time series (Kendall et al. 1999), 
they tend to consider the sources of error and variability as black 
boxes. In contrast, likelihood-based models can incorporate specific 
error-generating mechanisms (e.g. demographic stochasticity, sampling
methods, imperfect assays), and therefore generate more reliable
predictions (Clark & Bjørnstad 2004). Combined with
information criteria (such as Akaike's Information Criterion), they 
also allow multiple model comparison and weighting (Burnham & 
Anderson 2001). Likelihood-based methods allow the computation 
of confidence intervals on parameter estimates, providing evidence 
to compare the relative importance of the mechanisms considered. 
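
The combination of likelihood and information criteria can be sketched with a toy example: two hypothetical serological surveys, modelled with a binomial sampling error, and two competing hypotheses (a single shared prevalence versus a different prevalence in each survey). The counts are invented for illustration, and the maximum-likelihood estimates are available in closed form here, which would not be the case for most mechanistic models.

```python
import math


def binomial_log_likelihood(positives, tested, p):
    """Log-likelihood of observing `positives` out of `tested` given prevalence p."""
    log_choose = (math.lgamma(tested + 1) - math.lgamma(positives + 1)
                  - math.lgamma(tested - positives + 1))
    return log_choose + positives * math.log(p) + (tested - positives) * math.log(1.0 - p)


def aic(log_likelihood, n_parameters):
    """Akaike's Information Criterion: lower values indicate better support."""
    return 2.0 * n_parameters - 2.0 * log_likelihood


surveys = [(10, 100), (40, 100)]  # hypothetical (positives, tested) per survey

# Model A: one prevalence shared by both surveys (1 parameter).
p_shared = sum(k for k, _ in surveys) / sum(n for _, n in surveys)
loglik_a = sum(binomial_log_likelihood(k, n, p_shared) for k, n in surveys)

# Model B: a separate prevalence for each survey (2 parameters).
loglik_b = sum(binomial_log_likelihood(k, n, k / n) for k, n in surveys)

aic_a = aic(loglik_a, 1)
aic_b = aic(loglik_b, 2)
```

The extra parameter in Model B is penalised, so it is preferred only if the improvement in likelihood is large enough, as it is with these invented counts.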

The last two decades have seen considerable advancement in the 
statistical methods available to fit dynamic models to empirical data, 
especially in the field of infectious diseases (Becker & Britton 1999). 
An increasingly popular approach is to use a Bayesian framework, 
which offers several advantages. First, all model parameters follow 
probability distributions rather than being treated as fixed quantities. 
This enables measures of parameter uncertainty to be generated. 
Second, the Bayesian framework allows prior information (e.g. 
parameter ranges obtained from available data sources, such as pub- 
lished papers) to be incorporated into the model structure. This is 
somewhat controversial as poor choices of priors can unduly influ- 
ence the final estimates. However, where reliable information is 
available, a careful use of priors can help to ensure that the parame- 
ter estimates obtained are realistic. Indeed, certain combinations of 
parameter values that fall outside of meaningful biological ranges 
might happen to produce a perfect fit to the data by virtue of the 
mathematical properties of the model, but the use of prior informa- 
tion can help maintain parameters within biologically appropriate 
ranges. Third, in the Bayesian framework, any missing information 
in the data (e.g. due to incomplete observations or long time inter- 
vals between repeated measurements) can be treated as extra param- 
eters in the model, and estimated as part of the model fitting 
process (O'Neill & Roberts 1999; Clark & Bjørnstad 2004). Recent
methods have also been developed to deal with missing information 
within a frequentist framework (Ionides et al. 2006). Other new 
techniques, such as approximate Bayesian computation, provide a 
natural framework to estimate parameters in stochastic ecological 
models (Hartig et al. 2011), which are particularly relevant for infec- 
tious disease dynamics. 
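
To illustrate the logic of approximate Bayesian computation in its simplest rejection form (not the more efficient algorithms reviewed in the papers cited above), the sketch below estimates a per-contact transmission probability in a stochastic Reed-Frost chain-binomial epidemic from a single observed final outbreak size. The model, the uniform prior, the tolerance and the observed value are all assumptions chosen for illustration.

```python
import random


def reed_frost_final_size(n_susceptible, n_infectious, p, rng):
    """Simulate one Reed-Frost epidemic; return the number of new infections."""
    s, i, total = n_susceptible, n_infectious, 0
    while i > 0 and s > 0:
        # Probability that a susceptible is infected by at least one infectious host.
        p_infection = 1.0 - (1.0 - p) ** i
        new_cases = sum(1 for _ in range(s) if rng.random() < p_infection)
        s -= new_cases
        total += new_cases
        i = new_cases  # infectious for exactly one generation
    return total


def abc_rejection(observed, n_susceptible, n_draws, tolerance, prior_max, seed=1):
    """Keep prior draws whose simulated final size is within `tolerance` of the data."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        p = rng.uniform(0.0, prior_max)          # draw from a uniform prior
        simulated = reed_frost_final_size(n_susceptible, 1, p, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(p)
    return accepted


# Hypothetical outbreak: 30 of 49 initially susceptible animals infected.
posterior_sample = abc_rejection(observed=30, n_susceptible=49, n_draws=2000,
                                 tolerance=3, prior_max=0.1)
```

The accepted values approximate the posterior distribution of the transmission probability; their spread gives a direct measure of parameter uncertainty, and the prior bounds play the role of the biologically meaningful ranges discussed above.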

Whether using Bayesian or frequentist statistical models, a constant 
dialogue must be maintained between all parties involved. Indeed, 
the fitting process must be informed by the biologists to ensure the 
data are correctly interpreted and meaningfully analysed. It is not 
uncommon for data collection to differ to varying degrees from the
initial plans because of logistical issues or unexpected field conditions.
This may require a reassessment of the fitting procedures to
account for missing data. It is also essential to submit every output
from data analysis to a reality check by the field biologists: for 
example, unrealistic parameter estimates might reveal flaws in mod- 
els, guiding the selection of alternative models or the revision of 
unsuitable assumptions. As a result, it is not unusual for the process 
of fitting models to data to take several months to complete. The 
numerical algorithms involved are often very complex, with risk of 
human error, and can take several days to run, even on modern 
computers. 

Model validation 

The first step of validation is the assessment of the 'goodness of 
fit' of the model(s). Even though model fitting procedures aim to 
minimise the difference between observations and model predic- 
tion and to select the best-supported model, substantial discrepan- 
cies may remain. Statistical tests can be used to assess whether 
the remaining differences (or residuals) between the fitted model 
and the actual data may be attributed to random noise. However, 
such tests must be interpreted with caution: statistical support does 
not guarantee that the model assumptions are correct; conversely, a 
statistically significant discrepancy should not necessarily lead to a 
rejection of the mechanistic model as a whole. The value of any 
model lies in its ability to improve our understanding of specific 
processes, which does not necessarily require a perfect match to all 
the mechanisms of the real system. Therefore, a subjective assessment 
of model dynamics, informed by biological knowledge, remains 
important — this is another 'reality check' at the core of the MGF 
process (Fig. 1). Predictions from a fitted model should always be dis- 
cussed critically in the context of both the model structure and the 
data collected. 

The second step of validation confronts predictions of the model 
with an independent set of data, i.e. data not used in the fitting pro- 
cedure. For example, if the same variables have been measured in 
two different locations that differ in known characteristics, once the 
model has been fitted to the data from one site, it can be used to 
predict the observations in the other location by modifying certain 
parameter values accordingly. Alternatively, where several variables 
have been measured in a single study, it is possible to fit the model 
to a subset of variables and then use the model to predict the 
dynamics of the remaining variables. If successful, this step is strong 
evidence that some fundamentally important aspects of the system 
have been captured by the mechanistic model. This allows the 
researchers to draw conclusions about the causes of observed pat- 
terns. Thus, once a valid model is produced, it will help assess the 
original set of hypotheses. If the model fails this step of validation, 
it is worth reconsidering alternative models that may have been pro- 
posed in the early stages and discarded on the basis of the original 
fitting procedure. However, as in the previous step of validation, 
some discrepancies should be expected when confronting model 
predictions to independent data: again, this should not trigger an 
automatic rejection of the model, but rather encourage a critical 
assessment of its assumptions through discussion between modellers 
and empirical biologists. In particular, the magnitude of the discrep- 
ancies that can be accepted should be informed by the biology of 
the system, by potential uncertainties introduced by data collection 
procedures, and by the type of model-generated predictions sought. 
This step of validation can result in modifications of the model as
well as additional experiments; these iterative improvements are at 
the core of the MGF philosophy. 
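
The two-site validation described above can be sketched with a deliberately simple model. Here an exponential rate of change is fitted to counts from one site by least squares on the log scale, then used to predict counts at a second site that differs only in its initial value. The choice of model, the assumption that the rate transfers between sites unchanged, and the synthetic data are all illustrative.

```python
import math


def fit_exponential_rate(times, counts):
    """Least-squares slope of log(count) against time."""
    logs = [math.log(c) for c in counts]
    t_mean = sum(times) / len(times)
    y_mean = sum(logs) / len(logs)
    numerator = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    denominator = sum((t - t_mean) ** 2 for t in times)
    return numerator / denominator


def predict_counts(initial_count, rate, times):
    """Predicted counts at `times`, starting from `initial_count` at times[0]."""
    return [initial_count * math.exp(rate * (t - times[0])) for t in times]


def mean_relative_error(observed, predicted):
    """Average of |observed - predicted| / observed across observations."""
    return sum(abs(o - p) / o for o, p in zip(observed, predicted)) / len(observed)


times = [0, 1, 2, 3, 4]
site_a = [100 * math.exp(-0.3 * t) for t in times]   # synthetic fitting data
site_b = [40 * math.exp(-0.3 * t) for t in times]    # synthetic independent data

rate = fit_exponential_rate(times, site_a)            # fitted on site A only
error_b = mean_relative_error(site_b, predict_counts(site_b[0], rate, times))
```

With real data the error would not vanish, and the size of discrepancy that is acceptable would need to be agreed between modellers and field biologists, as discussed above.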

Lastly, a third component of model validation is an assessment of
the relative importance of the model parameters. There are two 
relatively common approaches for this. The first one uses an infor- 
mation criterion framework that penalises the explanatory power of 
models by their complexity (number of parameters), as described in 
step 4 above for model selection. Once the best model has been 
identified, it may still be possible to simplify it further, e.g. by set- 
ting the values of some parameters to zero and assessing the effect 
on the information criterion. The second approach is to use sensi- 
tivity analysis on a single validated model to determine the relative 
importance of different processes incorporated in that single model. 
In this approach, parameters are associated with particular processes 
of interest (e.g. transmission pathways). Parameters that strongly 
affect the behaviour of models (i.e. which have high sensitivity) are 
associated with relatively important processes. 

By submitting fitted models to detailed scrutiny, the scientific 
team should aim to identify any remaining discrepancies and issues, 
leading to further iterations of model improvement and data collec- 
tion. Components of the models that failed the validation tests can 
be more informative than those that passed, which echoes the 1911 
quote cited in the opening of this study. Indeed, a key aspect of 
MGF is that the model is not an end point. As in a strong inference
perspective (Platt 1964), the inference drawn from the model
should generate new questions and hypotheses that can be investi- 
gated through another cycle of MGF (Fig. 1). 



CASE STUDIES 

The MGF framework represents a flexible and modular set of 
guidelines for ambitious, long-term research programmes in ecology. 
In most cases, such programmes will progress through a series of 
specific research objectives corresponding to subsets of the whole 
framework. Appreciating from the outset how mathematical models 
can contribute to those different steps is not an obvious task, but 
one that requires patience and commitment from all parties 
involved. As mentioned in the introduction, the MGF framework 
is not meant as a rigid, one-size-fits-all set of rules, but as a guide- 
line for multidisciplinary integration. To demonstrate how MGF 
research can be implemented from different premises, we present 
detailed accounts of this process using two case studies from our 
own research. Whereas the first case study (plague transmission in 
prairie dogs, Fig. 2) followed Steps 1–5 quite naturally, the second 
one (lyssaviruses in bats, Figs 3 and 4) combines two initially inde- 
pendent projects which started at different points along the cycle 
until it was realised they could complement each other within an 
MGF approach. 

Figure 2 Application of MGF to plague (Yersinia pestis) in prairie dogs (Cynomys ludovicianus). The aim was to assess the relative importance of different routes of 
transmission of Y. pestis in causing observed extinctions of prairie dog towns. 

Figure 3 Application of MGF to bat rabies virus in big brown bats (Eptesicus fuscus). This study aimed at assessing the role of seasonal hibernation and birth pulses in the 
persistence of rabies virus in bat populations. Here, most data were collected before a formal mathematical model was developed. 

Figure 4 Application of MGF to Lagos bat virus (LBV) in straw-coloured fruit bats (Eidolon helvum). This ongoing research programme is investigating the interactions 
between bat life history (age structure, migrations, seasonal birth pulses) and the circulation of LBV. The modelling framework is being developed in conjunction with 
data collection and with input from the project on rabies virus in big brown bats. 

Plague transmission in prairie dogs 

Black-tailed prairie dogs (Cynomys ludovicianus) are extremely susceptible 
to plague and exhibit epizootic die-offs resulting in the apparent 
extinction of prairie dog towns. Researchers wished to determine 
the mechanisms underlying these spectacular die-offs to better 
understand how they occur and might be managed. According to 
the dominant paradigm in the literature, Yersinia pestis, the aetiological 
agent of plague, forms a biofilm that blocks the proventriculus 
of fleas feeding on mammals; transmission occurs when infectious 
bacteria from the blockage are regurgitated. Models that assumed 
blocked flea transmission as the only process predicted flea loads 
that were inconsistent with those observed in the field (Lorange 
et al. 2005; Webb et al. 2006). From these results and other observa- 
tions, field researchers questioned the relevance of the dominant 
assumption to the prairie dog system, especially because the para- 
digm was based on transmission in a peridomestic system that dif- 
fers in several important features from plague outbreaks in wildlife 
populations (Gage & Kosoy 2005). 

Plague has a rich and long scientific literature including many 
older studies with valuable information from detailed laboratory and 
field observations for different species of rodent hosts and flea 
vectors. Under closer scrutiny by empirical researchers and model- 
lers, the literature revealed an array of proposed transmission 
mechanisms that could broadly be classified into three types: 
blocked flea transmission, pneumonic transmission and transmission 
from a short-term reservoir. This led to the development of a 
mechanistic model that incorporated those three alternative trans- 
mission pathways (Webb et al. 2006). Most model parameters were 
taken from literature on prairie dogs or closely related species, 
although field data were used to estimate three remaining parame- 
ters using multiple methods including fitting procedures to the pro- 
posed model and estimation separate from the model. A stochastic 
version of the model was used to predict the probability of extinc- 
tion of prairie dogs and fleas and the time to extinction. These pre- 
dictions were validated using an independent 20-year data set of 
observed outbreaks. The model achieved a reasonable match to the 
observed data, and sensitivity analysis revealed that transmission 
from a short-term reservoir was the only route consistent with the 
observed data. This led to specific recommendations for data collec- 
tion, initiating a new cycle of MGF (Fig. 2). In particular, multiple 
hypotheses were consistent with the short-term reservoir scenario 
and constraints on the infectious period of the short-term reservoir 
predicted by the model, including early-phase transmission before 
blockage occurred, transmission from carcasses and transmission 
from alternative infected hosts. 
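
To make the idea of a stochastic extinction prediction concrete, the following minimal sketch simulates a lethal infection in a closed host population with the Gillespie algorithm and summarises the probability of host extinction and the mean time to extinction. It is not the model of Webb et al. (2006): it has a single transmission route, ignores fleas and reservoirs, and the function names and parameter values (beta, alpha, n_hosts) are hypothetical.

```python
import random


def simulate_outbreak(n_hosts, beta, alpha, rng):
    """One Gillespie run of a lethal infection in a closed host population.

    Returns (extinct, duration): whether no host survived, and the elapsed time.
    """
    s, i, t = n_hosts - 1, 1, 0.0
    while i > 0:
        rate_infection = beta * s * i / n_hosts
        rate_death = alpha * i
        total = rate_infection + rate_death
        t += rng.expovariate(total)  # waiting time to the next event
        if rng.random() < rate_infection / total:
            s, i = s - 1, i + 1  # a susceptible host becomes infected
        else:
            i -= 1  # an infected host dies
    return s == 0, t


def extinction_summary(n_runs, n_hosts, beta, alpha, seed=1):
    """Estimate extinction probability and mean time to extinction over many runs."""
    rng = random.Random(seed)
    runs = [simulate_outbreak(n_hosts, beta, alpha, rng) for _ in range(n_runs)]
    times = [t for extinct, t in runs if extinct]
    p_extinct = len(times) / n_runs
    mean_time = sum(times) / len(times) if times else float("nan")
    return p_extinct, mean_time


# Hypothetical values: a town of 50 hosts, one initial case.
p_extinct, mean_time = extinction_summary(n_runs=500, n_hosts=50, beta=1.0, alpha=0.2)
print(p_extinct, mean_time)
```

Predictions of this kind (a probability and a time distribution) can be compared directly with records of observed outbreaks, which is what makes them useful for validation.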

Based on the model, a series of experiments established that 
early-phase transmission was feasible in a laboratory setting (Eisen 
et al. 2006; Wilder et al. 2008), and measured the decay of infec- 
tiousness from carcasses. In parallel, data were also collected on 
infection rates in alternative hosts (Stapp & Salkeld 2009). This field 
and laboratory work corresponds to Step 3 in MGF (Fig. 2). 
Finally, the new data were incorporated into more specific models 
that accounted for the newly proposed transmission mechanisms 
and which were then validated (Salkeld et al. 2010; Buhnerkempe 
et al. 2011), corresponding to Steps 4 and 5 in MGF (Fig. 2). The 
two different models that were developed need to be reconciled, 
but overall it appears likely that early-phase transmission drives the 
initial spread of epizootics with secondary roles for other transmis- 
sion routes once host limitation occurs (Buhnerkempe et al. 2011). 

Seasonal dynamics of lyssaviruses in bats 

The role of bats as reservoirs of zoonotic viral infections is increasingly 
recognised (Calisher et al. 2006). Rabies virus and related lyssaviruses 
are important pathogens of bat origin (Badrane & Tordo 2001); however, 
mechanisms of persistence of lyssaviruses in populations 
of bats and the drivers thereof have not been well described. 
We have recently led two projects investigating the effects of two 
different seasonal behaviours on bat lyssavirus infection dynamics: 
hibernation in a temperate bat species and migration in a tropical 
bat species. In contrast to the previous case study which was a 
direct illustration of the MGF guidelines, this one shows the syn- 
ergy that can be gained by combining several threads of research 
into the MGF framework. 

Hibernation and rabies virus infection in big brown bats 

The first part of our case study focused on rabies virus persistence 
in a big brown bat (Eptesicus fuscus) population roosting in buildings 
in Fort Collins, Colorado. In this study, modellers relied on a post 
hoc synthesis of field and laboratory data to develop a population- 
level model for the seasonal dynamics of rabies virus, which was 
then validated with independent data (George et al. 2011). Thus, this 
project effectively implemented Steps 3, 4 and 5 of the MGF 
framework (Fig. 3). The project was initiated by field and laboratory 
scientists aiming to estimate host demographic parameters that were 
previously unavailable. From the outset of the 5-year field phase of 
the project, empirical estimates of bat demographic and infection- 
related parameters were obtained (Step 3). Field work included 
radio-tracking of bats and permanently tagging (Wimsatt et al. 2005) 
several thousand individual bats at multiple colonies, which pro- 
vided estimates of survival rates (O'Shea et al. 2011b). Reproductive 
rates were quantified by assessing the breeding status of captured 
females (O'Shea et al. 2010). The project also considered epidemio- 
logical parameters of the bat population, including determination of 
rabies seroprevalence and assessment of infection prevalence. Inter- 
pretation of serology data was helped by exposure experiments on 
captive big brown bats (Shankar et al. 2004; Davis et al. 2007; Turm- 
elle et al. 2010) and a statistical model for estimating the rate of 
exposure based on seroprevalence and seroconversion data from 
marked bats (George et al. 2011). 

The mathematical model consisted of three submodels that 
described the hibernation period (when no transmission occurs), 
pre-transmission period in early spring and the main transmission 
period (when transmission was assumed to follow a classical SEIR 
framework). Other structures representing alternative hypotheses 
were considered for the pre-transmission period where less was 
known. Combined results of demographic and serological sampling 
in the field, results of experimental exposure studies and informa- 
tion from the literature allowed modellers to estimate or bound 
model parameters. The model was validated with independent data 
from the study population, including population size (O'Shea et al. 
2011a), size of the infectious class (George et al. 2011) and the tim- 
ing of the peak number of rabies cases in different age classes 
(O'Shea et al. 2011a). 
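
The structure described above, in which different submodels apply in different seasons, can be sketched as a single update rule that switches on the day of the year. This is only a schematic illustration, not the model of George et al. (2011): the season boundaries, the parameter names (beta, sigma, gamma) and the assumption that disease progression pauses during hibernation are all hypothetical, and births and deaths are omitted.

```python
def period(day_of_year):
    """Map a day of the year to one of three seasonal submodels (dates are illustrative)."""
    if day_of_year < 90 or day_of_year >= 300:
        return "hibernation"
    if day_of_year < 120:
        return "pre_transmission"
    return "transmission"


def step(state, day_of_year, p, dt=1.0):
    """Advance (S, E, I, R) by one Euler step using the submodel for this day."""
    s, e, i, r = state
    n = s + e + i + r
    season = period(day_of_year)
    # Transmission only occurs during the main transmission period.
    beta = p["beta"] if season == "transmission" else 0.0
    # Assumption: incubation and removal are paused during hibernation.
    active = 0.0 if season == "hibernation" else 1.0
    new_exposed = beta * s * i / n if n > 0 else 0.0
    new_infectious = active * p["sigma"] * e
    removed = active * p["gamma"] * i
    return (
        s - new_exposed * dt,
        e + (new_exposed - new_infectious) * dt,
        i + (new_infectious - removed) * dt,
        r + removed * dt,
    )


def run(years, state, p):
    """Simulate daily steps over several years and return the full trajectory."""
    trajectory = [state]
    for day in range(365 * years):
        state = step(state, day % 365, p)
        trajectory.append(state)
    return trajectory


# Hypothetical rates per day and initial numbers of bats.
trajectory = run(2, (990.0, 5.0, 5.0, 0.0), {"beta": 0.3, "sigma": 0.05, "gamma": 0.2})
print(trajectory[-1])
```

Alternative hypotheses about the pre-transmission period would correspond to alternative branches in the step function, each of which could then be confronted with the same data.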

The model suggests that rabies virus is maintained in the popula- 
tion because the system is essentially in stasis during the hibernation 
period, which allows persistence of infection until the spring birth 
pulse (George et al. 2011). Although this project did not begin with 
the design of a mathematical model, the field and experimental biologists 
had an unusually clear conceptual model from the start and worked 
closely with modellers as data collection was ending. In hindsight, 
the project could have benefitted from earlier involvement of the 
modellers; nevertheless, it demonstrated how empirical researchers 
and modellers can effectively iterate several steps of MGF post hoc 
(Fig. 3). The resultant model would not have been possible or compelling 
without the field and laboratory data generated at different 
stages of the project. In turn, the modelling integrated the empirical 
information, yielded new insights into rabies maintenance in the 
host population and tested alternative hypotheses regarding the 
pre-transmission season. In addition, this provided us with a starting 
point to develop a model for our next study. 

Migration and LBV in straw-coloured fruit bats 

The second part of our case study on bat-lyssavirus systems con- 
cerns Lagos bat virus (LBV), a lyssavirus that has been isolated 
from the African frugivorous bat Eidolon helvum (Boulger & Porter- 
field 1958). This project exemplifies how emerging infectious dis- 
ease research can be undertaken when little prior knowledge exists. 
A high prevalence of antibodies against LBV had been detected in 
this migratory species (Hayman et al. 2008; Kuzmin et al. 2008), 
which led us to formulate three overarching questions. Given that 
most lyssaviruses were thought to have high lethality rates in 
any mammalian host they infect, could LBV remain endemic in 
E. helvum? How would the spatiotemporal structure of bat popu- 
lations at the continental scale affect viral persistence? Could 
E. helvum act as a reservoir for zoonotic spillover into human and 
livestock populations? Eidolon helvum often roosts in enormous colo- 
nies reaching several million bats (Sorensen & Halberg 2001), which 
have been documented to form and disperse seasonally across sub- 
Saharan Africa, in both urban and rural environments. Based on 
this prior knowledge, we set out to address our three questions, 
with an initial focus on the capital of Ghana, Accra, home to a 
large colony of E. helvum in close contact with city-dwellers. We 
have concentrated our attention on two main factors: heterogeneity 
in host populations (e.g. variations in prevalence with location or 
age) and potential seasonal forcing of transmission caused by the 
seasonal migratory and reproductive behaviours of E. helvum. We 
have been addressing these issues following the MGF framework 
(Fig. 4) and using to our advantage the knowledge acquired through 
the big brown bat study. 

Using the limited knowledge available on the life cycle of 
E. helvum (Mutere 1967) and the dynamics of other bat lyssaviruses 
(Mondul et al. 2003; Harris et al. 2006), we initiated the project with 
a simple ecological model (Fig. 4, Step 1), based on the hypotheses 
that LBV infection in E. helvum can be transmitted horizontally 
within bat populations and results in protective immunity, given the 
high seroprevalence detected. Given the lack of empirical informa- 
tion on those two hypotheses, the model-design process generated a 
large number of more specific questions concerning, in particular, 
the existence of protective maternal antibodies, the duration of 
immunity and the lethality of infection. The absence of quantitative 
information on the ecology and demography of the bats meant that 
demographic studies were also necessary. Thus, over a few years, 
we iteratively accrued data and parameter estimates through diverse 
field studies of the host, the pathogen and their environment, com- 
bined with a series of demographic and epidemiological mathemati- 
cal models (Fig. 4, repeated loops between Steps 2 and 3). 

We have been conducting in parallel field-based studies of wild 
populations, surveys of captive bats maintained in semi-natural con- 
ditions and laboratory-based development of immunological and 
virological assays (Hayman et al. 2011), all guided by and feeding 
back into the modelling framework. The number of bats in E. helvum 
colonies far exceeds the number that can be studied using a traditional 
capture-recapture marking survey. Simulations 
based on expected survival rates suggested that radio-telemetry as a 
method of redetection of around 100 tagged bats in the colony over 
a whole season would give reliable estimates of survival (Hayman 
et al. 2012b). Other approaches, such as the estimation of ages using 
tooth-cementum ring analyses, have enabled us to estimate age- 
specific seroprevalence, and hence infection rates (Hayman et al. 
2012a). To address the unresolved issue of protective acquired 
immunity in bats, a captive, wild-caught colony is now being stud- 
ied. By sampling these bats regularly, we are seeking to determine 
whether serological parameters vary over time, whether all individu- 
als are born susceptible and whether seropositive bats may be per- 
sistently infected. Alongside this ongoing empirical work, we are 
now adapting the seasonal model developed by George et al. (2011) 
for rabies virus in big brown bats, to LBV in E. helvum. Once data 
collection from captive bats is complete, we will be ready to move 
to Step 4 and fit our updated transmission model to empirical data. 
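
The use of simulations to guide study design, as in the radio-telemetry example above, can be sketched as follows. This is not the analysis of Hayman et al. (2012b): it is a deliberately simplified illustration that assumes every surviving tagged bat is redetected, and the function name, the survival probability of 0.8 and the sample sizes are hypothetical.

```python
import random
import statistics


def simulated_estimates(n_tagged, true_survival, n_sims, seed=0):
    """Distribution of survival estimates from repeated simulated tagging studies.

    Assumes every surviving tagged bat is redetected (perfect detection).
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_sims):
        # Each tagged bat survives the season independently with the same probability.
        survivors = sum(rng.random() < true_survival for _ in range(n_tagged))
        estimates.append(survivors / n_tagged)
    return estimates


# Compare the spread of estimates for three candidate sample sizes.
for n_tagged in (25, 100, 400):
    estimates = simulated_estimates(n_tagged, true_survival=0.8, n_sims=2000)
    print(n_tagged, round(statistics.mean(estimates), 3), round(statistics.stdev(estimates), 3))
```

Comparing the spread of the simulated estimates across sample sizes indicates how many animals need to be tagged to reach a given precision before any fieldwork is undertaken.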

DISCUSSION 

We have presented a detailed framework for MGF and described, 
through case studies, different methods of implementation. Key dif- 
ferences between MGF and more commonly applied approaches to 
modelling ecological dynamics are the early-stage input from both 
modellers and biologists into study design, incorporation of multiple 
hypotheses and uncertainty about structure in the determination of 
the data required and the iterative approach between models and 
measurement. Post hoc modelling studies have an important role to 
play to complement traditional data analysis and generate new pre- 
dictions. However, such studies tend to be limited in their scope 
and power if the modellers were not involved in study design. 
A lack of mutual understanding or communication between the 
modellers and the scientists who collected the data can cause 
lengthy adjustments of the model, can result in parts of the data 
being unusable due to their inadequate collection or reporting and 
may bring into question the reliability of the predictions generated. 
When undertaken properly, MGF allows the assessment of multiple 
system-specific hypotheses that relate to unobserved mechanisms, 
by combining information from different organisational levels (indi- 
vidual hosts, populations and landscapes). This information often 
comes from diverse sources, such as field surveys, laboratory experi- 
ments and surveillance, that can be integrated into the modelling 
framework. While we put fieldwork at the core of MGF as the fun- 
damental source of data and observations on wildlife ecology, our 
framework can incorporate complementary data sources, such as 
experiments involving captive animals. As a result of this broader 
integration, the data gathered will be exploited to their full potential 
and will thus lead to richer and more reliable conclusions and 
predictions. 

Beyond the technical aspects that we have described in detail, 
MGF provides a framework for adaptively managing the human 
dimension of interdisciplinary research collaborations. Indeed, the 
iterative process allows input from all disciplines at multiple points, 
helping to resolve or avoid altogether counter-productive situations 
where modellers are asked to develop a model without sufficient 
data or where field biologists are handed a model with inappropriate 
assumptions. However, even within this framework, there are criti- 
cal capabilities that must be developed within the research group as 
a whole to ensure an efficient and durable collaboration. First, the 
group must define common goals and questions (Gorman et al. 
2011), which is not as easy to achieve as it may sound. Too often 
in research, interdisciplinary collaborators draw targets around their 
current personal research interests rather than defining superordi- 
nate goals and then agreeing on the means to achieve them. The 
MGF approach essentially forces participants to develop shared 
questions and goals before the research is undertaken. Second, 
members of the collaboration need to develop 'moral imagination' 
which involves seeing the problems from the perspective of other 
stakeholders (Gorman et al. 2011). Often researchers from different 
fields will have fundamentally different perspectives on how to 
tackle a problem, aptly described by Gorman et al. (2001) who sta- 
ted that 'most scholars like to share frameworks about as much as 
they like to share toothbrushes'. Hence, the third capability, devel- 
oping 'trading zones' (Galison 1997), is necessary for exchanging 
ideas or sharing data and resources. All three of these capabilities 
can be facilitated by 'interactional experts' — individuals who under- 
stand enough of the disciplinary cultures and languages to facilitate 
a common language, common goals, shared mental models, 
exchange of knowledge and a shared framework for investigation 
(Gorman 2010). Such expertise has traditionally been gained over 
long periods of personal collaborations, but more opportunities are 
now available for early-career training through formal courses and 
workshops. For example, in MGF, interactional experts would 
include veterinarians with post-graduate training in mathematical 
modelling, immunologists with training in ecology or mathematicians 
with training in epidemiology. Both the supply of and the demand for 
such interdisciplinary training need to be encouraged. 

Now is an exciting time to implement MGF approaches. There is 
increasing emphasis on the need for multi- or interdisciplinary stud- 
ies of many systems, particularly regarding the emergence of infec- 
tious disease threats to biodiversity and public health (Wolfe et al. 
2007; Jones et al. 2008). Many national and international funding 
bodies have started to actively support the integration of empirical 
research with modelling, or model-guided predictions, not only in 
their core programmes but also increasingly in direct response to 
emerging epizootics (e.g. white nose syndrome in North American 
bats) or zoonoses (e.g. pandemic swine influenza). One area where 
real progress can be made at low additional cost through MGF 
approaches is pathogen or disease surveillance — a point apparently 
overlooked by Kuiken et al. (2005) in their call to arms to tackle 
emerging zoonotic infections. Such approaches to surveillance 
would potentially increase the return on investment by addressing 
traditional surveillance questions as well as more mechanistic ones. 

Although we have focused our attention on wildlife infections, 
the MGF framework would be equally useful in other areas of 
applied ecological research where mechanistic models could help 
devise quantitative predictions for intervention, as indicated by ret- 
rospective reviews, e.g. on pest management (Murdoch et al. 2006). 
Potential applications include the management of invasive species, 
conservation of biodiversity in the face of climate change or the 
sustainability of human exploitation of natural resources. Advice to 
policy makers in these fields all too often relies on 'expert opinion', 
a euphemism for the subjective synthesis of vast amounts of scien- 
tific evidence and personal experience. By providing a rigorous 
framework for the construction of such syntheses, the MGF 
approach has the potential to become a standard for evidence-based 
environmental policies. Explicit embedding of MGF approaches into 
policy-directed programmes can be achieved using frameworks such 
as participatory impact pathways analysis (Alvarez et al. 2010). A 
clearly defined pathway towards a successful outcome can provide 
incentives for all scientists to engage in what for many could be an 
unfamiliar and uncomfortable, albeit hugely exciting, process. 